Multi-keyword spotting of telephone speech using orthogonal transform-based SBR and RNN prosodic model
Abstract
In this paper, an orthogonal transform-based signal bias removal (OTSBR) approach and an RNN prosodic model are proposed for multi-keyword spotting of telephone speech. OTSBR is employed in the pre-processing stage of acoustic decoding and estimates the channel bias in order to eliminate the acoustic mismatch between the training and testing environments. The RNN prosodic model is adopted in the post-processing stage of acoustic decoding to detect word boundaries, which are used to reorder the keyword candidates produced by the keyword spotter. Simulations on a real speech database collected from the Phone Directory Assistant Service developed at Chunghwa Telecommunication Laboratories (CTL-PDAS) were performed to evaluate the proposed methods. Experimental results showed that a keyword detection rate of 71.0% and a top-5 keyword inclusion rate of 81.8% can be attained by incorporating OTSBR and the RNN prosodic model into the system.
Similar resources
Telephone speech multi-keyword spotting using fuzzy search algorithm and prosodic verification
In this paper a fuzzy search algorithm is proposed to deal with the recognition error for telephone speech. Since the prosodic information is a very special and important feature for Mandarin speech, we integrate the prosodic information into keyword verification. For multi-keyword detection, we define a keyword relation and a weighting function for reasonable keyword combinations. In the keywo...
Utterance Verification Using Prosodic Information for Mandarin Telephone Speech Keyword Spotting - Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference o
In this paper, the prosodic information, a very special and important feature in Mandarin speech, is used for Mandarin telephone speech utterance verification. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIAL’s and 37 FINAL’s in Mandarin speech, and one background/silence model, are used as the b...
Utterance verification using prosodic information for Mandarin telephone speech keyword spotting
In this paper, the prosodic information, a very special and important feature in Mandarin speech, is used for Mandarin telephone speech utterance verification. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIAL’s and 37 FINAL’s in Mandarin speech, and one background/silence model, are used a...
Robust Multi-Keyword Spotting of Telephone Speech Using Stochastic Matching
In telephone speech recognition, the acoustic mismatch between the training and the test environment often causes severe degradation due to the channel distortion and ambient noise. In this paper, a two-level codebook-based stochastic matching (CBSM) is proposed to deal with the acoustic mismatch. For multi-keyword detection, we define a keyword relation table and a weighting function for reaso...
Comparison of keyword spotting methods for searching in speech
This paper presents and discusses keyword spotting methods for searching in speech. In contrast with searching in text, the searching in speech or generally in multimedia data still represents a challenge. The aim of the paper is to present a keyword spotting (KWS) method based on a large vocabulary continuous speech recognition (LVCSR) system, based on phonetics decoder, and keyword spotting u...
Publication date: 2001